Changed the parse_run_traces to include last 4 letters of run_id #1880

Open
wants to merge 4 commits into main
Conversation

@Vidit-Ostwal (Contributor)

#1871
Changed the parse_run_traces function to include the last 4 letters of the run_id of each trace, which yields a unique key for every call.
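
A minimal sketch of what this keying change does, using a hypothetical PromptTrace stand-in for the internal trace object (the class name and fields here are illustrative, not the actual ragas internals):

import typing as t
import uuid
from dataclasses import dataclass, field

@dataclass
class PromptTrace:
    # Hypothetical stand-in for the trace object parse_run_traces reads.
    name: str
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    outputs: t.Dict[str, t.Any] = field(default_factory=dict)

def key_prompt_traces(traces: t.List[PromptTrace]) -> t.Dict[str, t.Any]:
    prompt_traces: t.Dict[str, t.Any] = {}
    for prompt_trace in traces:
        output = prompt_trace.outputs.get("output", {})
        output = output[0] if isinstance(output, list) else output
        # Suffixing the prompt name with a 4-character slice of run_id
        # gives each call its own key, so repeated calls to the same
        # prompt no longer overwrite each other.
        prompt_traces[f"{prompt_trace.name}_{prompt_trace.run_id[:4]}"] = output
    return prompt_traces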

@Vidit-Ostwal (Contributor, Author)

Hi @jjmachan, can you help me with this one? All the unit tests have failed because of PR #1879.

@Vidit-Ostwal (Contributor, Author)

The code change was small. From this:

def validate_samples(self, samples: t.List[Sample]) -> t.List[Sample]:
    """Validates that all samples are of the same type."""
    if len(samples) == 0:
        return samples

    first_sample_type = type(self.samples[0])
    if not all(isinstance(sample, first_sample_type) for sample in self.samples):
        raise ValueError("All samples must be of the same type")

    return samples

to this:


def validate_samples(self, samples: t.List[Sample]) -> t.List[Sample]:
    """Validates that all samples are of the same type."""
    if len(samples) == 0:
        return samples

    first_sample_type = type(samples[0])
    for i, sample in enumerate(samples):
        if not isinstance(sample, first_sample_type):
            raise ValueError(f"Sample at index {i} is of type {type(sample)}, expected {first_sample_type}")

    return samples

which keeps the same type-comparison behavior while iterating (now over the samples argument rather than self.samples) and additionally reports the index of the first mismatched sample.
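
For reference, a standalone harness exercising the new behavior, with hypothetical Sample subclasses (A, B) standing in for the real sample types:

import typing as t

class Sample: ...
class A(Sample): ...
class B(Sample): ...

def validate_samples(samples: t.List[Sample]) -> t.List[Sample]:
    """Module-level copy of the method above, for a quick standalone check."""
    if len(samples) == 0:
        return samples
    first_sample_type = type(samples[0])
    for i, sample in enumerate(samples):
        if not isinstance(sample, first_sample_type):
            raise ValueError(f"Sample at index {i} is of type {type(sample)}, expected {first_sample_type}")
    return samples

validate_samples([A(), A()])      # passes: all samples share one type
try:
    validate_samples([A(), B()])  # raises, now naming the offending index
except ValueError as e:
    print(e)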

@Vidit-Ostwal (Contributor, Author)

Hi @jjmachan, does this solution align with the given issue?

@@ -164,7 +164,7 @@ def parse_run_traces(
     prompt_trace = traces[prompt_uuid]
     output = prompt_trace.outputs.get("output", {})
     output = output[0] if isinstance(output, list) else output
-    prompt_traces[f"{prompt_trace.name}"] = {
+    prompt_traces[f"{prompt_trace.name}_{prompt_trace.run_id[:4]}"] = {
@jjmachan (Member)

Suggested change:
-    prompt_traces[f"{prompt_trace.name}_{prompt_trace.run_id[:4]}"] = {
+    prompt_traces[f"{prompt_trace.name}_{i}"] = {

What if we use i instead? Would that be better?

Right now I think it might be a bit more confusing, since users have no way of guessing the run_id. This would make it a bit more predictable. What do you think?

PS: Thanks a lot for taking the time to fix it 🙂

@Vidit-Ostwal (Contributor, Author)

@jjmachan Agreed, the user won't have a clue what the run_id is, so it can be a bit confusing. If we append the {i}, the output would look more like:

{'claim_decomposition_prompt_1': {
     'input': ClaimDecompositionInput(response='Eifel tower is in Paris', sentences=['Eifel tower is in Paris']),
     'output': ClaimDecompositionOutput(decomposed_claims=[['Eiffel tower is in Paris']])},
 'n_l_i_statement_prompt_2': {
     'input': NLIStatementInput(context='Paris, France is a city where Eifel tower is located', statements=['Eiffel tower is in Paris']),
     'output': NLIStatementOutput(statements=[StatementFaithfulnessAnswer(statement='Eiffel tower is in Paris', reason='The context explicitly states that Eiffel tower is located in Paris, France.', verdict=1)])},
 'claim_decomposition_prompt_3': {
     'input': ClaimDecompositionInput(response='Paris, France is a city where Eifel tower is located', sentences=['Paris, France is a city where Eifel tower is located']),
     'output': ClaimDecompositionOutput(decomposed_claims=[['Paris is a city in France.'], ['Eiffel Tower is located in Paris.']])},
 'n_l_i_statement_prompt_4': {
     'input': NLIStatementInput(context='Eifel tower is in Paris', statements=['Paris is a city in France.', 'Eiffel Tower is located in Paris.']),
     'output': NLIStatementOutput(statements=[StatementFaithfulnessAnswer(statement='Paris is a city in France.', reason='The context does not provide any information about the location of Paris.', verdict=0),
                                              StatementFaithfulnessAnswer(statement='Eiffel Tower is located in Paris.', reason='The context explicitly states that the Eiffel Tower is in Paris.', verdict=1)])}}

This solution is more user-centric, since the trailing index simply reflects the sequence of prompt calls within Ragas' reasoning process.

@Vidit-Ostwal (Contributor, Author) commented Feb 2, 2025

Hi @jjmachan, I updated the traces functionality to use {prompt_number}: {trace}.

The new traces should look like:

{'prompt_1': {'name': 'claim_decomposition_prompt',
              'input': ClaimDecompositionInput(response='Eifel tower is in Paris'),
              'output': ClaimDecompositionOutput(claims=['Eifel tower is in Paris.'])},
 'prompt_2': {'name': 'n_l_i_statement_prompt',
              'input': NLIStatementInput(context='Paris, France is a city where Eifel tower is located', statements=['Eifel tower is in Paris.']),
              'output': NLIStatementOutput(statements=[StatementFaithfulnessAnswer(statement='Eifel tower is in Paris.', reason='The context explicitly states that Eifel tower is located in Paris, France.', verdict=1)])},
 'prompt_3': {'name': 'claim_decomposition_prompt',
              'input': ClaimDecompositionInput(response='Paris, France is a city where Eifel tower is located'),
              'output': ClaimDecompositionOutput(claims=['Paris is a city in France.', 'Eiffel Tower is located in Paris.'])},
 'prompt_4': {'name': 'n_l_i_statement_prompt',
              'input': NLIStatementInput(context='Eifel tower is in Paris', statements=['Paris is a city in France.', 'Eiffel Tower is located in Paris.']),
              'output': NLIStatementOutput(statements=[StatementFaithfulnessAnswer(statement='Paris is a city in France.', reason='The context does not provide any information about the location of Paris.', verdict=0),
                                                       StatementFaithfulnessAnswer(statement='Eiffel Tower is located in Paris.', reason='The context explicitly states that the Eiffel Tower is in Paris.', verdict=1)])}}
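
A sketch of this final scheme, reusing the hypothetical PromptTrace stand-in from the earlier sketch: the keys become prompt_1, prompt_2, ... in call order, and the prompt name moves into the value (real traces also carry the prompt input, which the stand-in omits):

def key_traces_by_index(traces: t.List[PromptTrace]) -> t.Dict[str, t.Any]:
    prompt_traces: t.Dict[str, t.Any] = {}
    # Enumerate from 1 so the keys read prompt_1, prompt_2, ... in the
    # order the prompts were called.
    for i, prompt_trace in enumerate(traces, start=1):
        output = prompt_trace.outputs.get("output", {})
        output = output[0] if isinstance(output, list) else output
        prompt_traces[f"prompt_{i}"] = {
            "name": prompt_trace.name,  # the prompt name, no longer the key
            "output": output,
        }
    return prompt_traces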

@Vidit-Ostwal (Contributor, Author)

Hi @sahusiddharth, can you check whether this is because of the previous unit-test PR you made?

@jjmachan (Member) commented Feb 4, 2025

Hey @Vidit-Ostwal, this is great, much more readable.

[image]

But one concern is that this is too big a change and might break things. Can I keep this on pause till v0.3?

We are planning to maybe release that with another feature we wanted; this way we honor semver best.

@sahusiddharth, also roping you in on this.

@Vidit-Ostwal (Contributor, Author)

@jjmachan, sure. I can coordinate with @sahusiddharth as well to resolve some conflicts.

Successfully merging this pull request may close these issues.

Traces in evaluation result are only saving the last prompt per trace name